Learning Tetris Using the Noisy Cross-Entropy Method
Authors
Abstract
The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
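The idea in the abstract can be sketched on a toy problem. The block below is a minimal illustration, not the authors' implementation: it assumes a diagonal Gaussian over the parameter vector and a constant noise term added to the variance after each update, which is one common way to keep the search distribution from collapsing too early. The function names (`noisy_cross_entropy`, `toy_score`), the toy objective, and all hyperparameter values are hypothetical; a Tetris controller would replace `toy_score` with the score obtained by playing games with weight vector `w`.

```python
import random
import statistics


def noisy_cross_entropy(score, dim, iterations=50, population=100,
                        elite_frac=0.1, noise=0.0, seed=0):
    """Maximize score(w) over R^dim with a diagonal-Gaussian cross-entropy search.

    `noise` is an assumed constant added to every variance after each update;
    noise=0.0 gives the plain cross-entropy method.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim      # assumed starting mean
    var = [100.0] * dim     # assumed starting variance (std = 10)
    n_elite = max(2, int(population * elite_frac))

    for _ in range(iterations):
        # Draw a population of candidate parameter vectors.
        samples = [[rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
                   for _ in range(population)]
        # Keep the best-scoring candidates (the elite set).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite set, one dimension at a time.
        columns = list(zip(*elite))
        mean = [statistics.mean(c) for c in columns]
        # The added noise keeps the variance from shrinking to zero.
        var = [statistics.pvariance(c) + noise for c in columns]
    return mean, var


def toy_score(w):
    """Hypothetical objective: negative squared distance to (3, -2)."""
    target = (3.0, -2.0)
    return -sum((x - t) ** 2 for x, t in zip(w, target))


if __name__ == "__main__":
    plain_mean, plain_var = noisy_cross_entropy(toy_score, dim=2, noise=0.0)
    noisy_mean, noisy_var = noisy_cross_entropy(toy_score, dim=2, noise=0.5)
    print("plain CE:", plain_mean, plain_var)
    print("noisy CE:", noisy_mean, noisy_var)
```

On this smooth toy objective the plain variant already converges, so the noise only shows up as a variance that never drops below the noise level. The benefit described in the abstract concerns noisy, hard-to-evaluate objectives such as Tetris scores, where a collapsed variance can freeze the search at a poor policy.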
Related papers
Cross-Entropy Method for Reinforcement Learning
Reinforcement Learning methods have been successfully applied to various optimization problems. Scaling them up to real-world-sized problems has, however, proved more difficult. In this research we apply Reinforcement Learning to the game of Tetris, which has a very large state space. We not only try to learn policies for Standard Tetris but try to learn parameterized policies for Generalized Te...
Notes Improvements on Learning Tetris with Cross-entropy
For playing the game of Tetris well, training a controller by the cross-entropy method seems to be a viable way (Szita and Lőrincz, 2006; Thiery and Scherrer, 2009). We consider this method to tune an evaluation-based one-piece controller as suggested by Szita and Lőrincz and we introduce some improvements. In this context, we discuss the influence of the noise, and we perform experiments with ...
Improvements on Learning Tetris with Cross Entropy
For playing the game of Tetris well, training a controller by the cross-entropy method seems to be a viable way (Szita and Lőrincz, 2006; Thiery and Scherrer, 2009). We consider this method to tune an evaluation-based one-piece controller as suggested by Szita and Lőrincz and we introduce some improvements. In this context, we discuss the influence of the noise, and we perform experiments with ...
Tetris-: Exploring Human Performance via Cross Entropy Reinforcement Learning Models
What can a machine learning simulation tell us about human performance in a complex, real-time task such as Tetris™? Although Tetris is often used as a research tool (Mayer, 2014), the strategies and methods used by Tetris players have seldom been the explicit focus of study. In Study 1, we use cross-entropy reinforcement learning (CERL) (Szita & Lorincz, 2006; Thiery & Scherrer, 2009) to expl...
Approximate Dynamic Programming Finally Performs Well in the Game of Tetris
Tetris is a video game that has been widely used as a benchmark for various optimization techniques including approximate dynamic programming (ADP) algorithms. A look at the literature of this game shows that while ADP algorithms that have been (almost) entirely based on approximating the value function (value function based) have performed poorly in Tetris, the methods that search directly in ...
Journal title:
- Neural Computation
Volume 18, Issue 12
Pages -
Publication date: 2006